Automatically predicting MT systems rankings compatible with Fluency, Adequacy or Informativeness scores

Authors

  • Martin Rajman
  • Tony Hartley
Abstract

The main goal of the work presented in this paper is to find an inexpensive and automatable way of predicting rankings of MT systems compatible with human evaluations of these systems expressed in the form of Fluency, Adequacy or Informativeness scores. Our approach is to establish whether there is a correlation between rankings derived from such scores and the ones that can be built on the basis of automatically computable attributes of syntactic or semantic nature. We present promising results obtained on the DARPA94 MT evaluation corpus.
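
As a rough illustration of what "a correlation between rankings" can mean in practice, the sketch below compares a ranking derived from human scores with one derived from an automatically computed attribute, using Spearman's rank correlation. Everything in it is an assumption made for illustration: the system names, the score values, and the choice of Spearman's coefficient are hypothetical and are not taken from the paper or from the DARPA94 corpus.

```python
from typing import Dict


def ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank systems from best (1) to worst; tied scores share the average rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    result: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of systems tied with the one at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[ordered[k]] = average_rank
        i = j + 1
    return result


def spearman(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if set(a) != set(b):
        raise ValueError("both score tables must cover the same systems")
    ra, rb = ranks(a), ranks(b)
    systems = sorted(ra)
    xs = [ra[s] for s in systems]
    ys = [rb[s] for s in systems]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a ranking in which every system is tied has no correlation")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: mean human Fluency score per system (higher is better).
human_fluency = {"sys_a": 0.62, "sys_b": 0.48, "sys_c": 0.55, "sys_d": 0.30, "sys_e": 0.41}
# Hypothetical data: an automatically computed attribute for the same systems.
auto_attribute = {"sys_a": 12.1, "sys_b": 10.4, "sys_c": 9.8, "sys_d": 7.5, "sys_e": 8.2}

rho = spearman(human_fluency, auto_attribute)
print(round(rho, 3))  # prints 0.9 for this made-up data
```

A value near 1 would indicate that the automatic attribute orders the systems almost as the human scores do; with only a handful of systems, such a coefficient is at best suggestive.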

Similar resources

Automatic Ranking of MT Systems

In earlier work, we succeeded in automatically predicting the relative rankings of MT systems derived from human judgments on the Fluency, Adequacy or Informativeness of their output. In this paper, we present an experiment – using human evaluators and additional data – designed to test the robustness of our earlier results. These had yielded two promising automatically computable predictors, t...

Calibrating resource-light automatic MT evaluation

MT systems are traditionally evaluated with different criteria, such as adequacy and fluency. Automatic evaluation scores are designed to match these quality parameters. In this paper we introduce a novel parameter – usability (or utility) of output, which was found to integrate both fluency and adequacy. We confronted two automated metrics, BLEU and LTV, with new data for which human evaluatio...

Calibrating Resource-light Automatic MT Evaluation: a Cheap Approach to Ranking MT Systems by the Usability of Their Output

MT systems are traditionally evaluated with different criteria, such as adequacy and fluency. Automatic evaluation scores are designed to match these quality parameters. In this paper we introduce a novel parameter – usability (or utility) of output, which was found to integrate both fluency and adequacy. We confronted two automated metrics, BLEU and LTV, with new data for which human evaluatio...

A Cluster-Based Representation for Multi-System MT Evaluation

Automatic evaluation metrics are often used to compare the quality of different systems. However, a small difference between the scores of two systems does not necessarily reflect a real difference between their performance. Because such a difference can be significant or only due to chance, it is inadvisable to use a hard ranking to represent the evaluation of multiple systems. In this paper, we...

Predicting Machine Translation Adequacy with Document Embeddings

This paper describes USAAR’s submission to the metrics shared task of the Workshop on Statistical Machine Translation (WMT) in 2015. The goal of our submission is to take advantage of the semantic overlap between hypothesis and reference translation for predicting MT output adequacy using language-independent document embeddings. The approach presented here is learning a Bayesian Ridge Regr...

Publication date: 2001